Assumptions behind Intercoder Reliability Indices
Abstract
Intercoder reliability is the most frequently used quantitative indicator of measurement quality in content studies. Researchers in psychology, sociology, education, medicine, marketing and other disciplines also use reliability to evaluate the quality of diagnoses, tests and other assessments. Many indices of reliability have been recommended for general use. This article analyzes 22 of them, organized into 18 chance-adjusted and four non-adjusted indices. The chance-adjusted indices are further organized into three groups: nine category-based indices, eight distribution-based indices, and one double-based index that draws on both category and distribution. The main purpose of this work is to examine the assumptions behind each index. Most of these assumptions are unexamined in the literature, yet they have implications for assessments of reliability that need to be understood, and they result in paradoxes and abnormalities. This article discusses 13 paradoxes and nine abnormalities to illustrate the 24 assumptions. To facilitate understanding, the analysis focuses on categorical scales with two coders, and further focuses on binary scales where appropriate. The discussion is situated mostly in the analysis of communication content. The assumptions and patterns that we discover also apply to studies, evaluations and diagnoses in other disciplines with more coders, raters, diagnosticians, or judges using binary or multi-category scales. We argue that a new index is needed. Before the new index can be established, guidelines are needed for using the existing indices, and this article recommends such guidelines.
Annals of the International Communication Association, 36(1), 419–480. http://doi.org/10.1080/23808985.2013.11679142
Similar Resources
Problems for Reliable Discourse Coding Systems
Focusing on issues of intercoder reliability, this paper describes problems experienced in designing coding systems that classify language using discourse-relevant categories. First, given the absence of a consensus among language scholars, we examine options for selecting and structuring code categories, particularly those which have an impact on intercoder reliability. We observe that compute...
Achieving Intercoder Reliability with a Taxonomy of Speech Acts
This paper presents 1) a means of developing a decision procedure to help human taggers label utterances with speech acts, significantly increasing the chances of achieving intercoder reliability, and 2) a principled method for selecting a set of speech acts that successfully addresses a tradeoff between distinguishable speech acts at a high level of abstraction and informative speech acts at a l...
Mapping the categories of the Swedish primary health care version of ICD-10 to SNOMED CT concepts: Rule development and intercoder reliability in a mapping trial
BACKGROUND: Terminologies and classifications are used for different purposes and have different structures and content. Linking or mapping terminologies and classifications has been pointed out as a possible way to achieve various aims as well as to attain additional advantages in describing and documenting health care data. The objectives of this study were: to explore and develop rules to be ...
Towards an Empirical Model of Argumentation in Medical Genetics
We present a coding scheme, based on a Bayesian Network (BN) formalism, for describing probabilistic and causal information in arguments in medical genetics. The scheme was applied to a corpus of genetic counseling letters and evaluated for intercoder reliability. Results show that the model is highly relevant to the corpus while intercoder reliability of the coding scheme is good. We plan to u...
Intercoder reliability in annotating complex disfluencies
In previous work, we presented an annotation scheme that can describe complex disfluencies. In this paper, we first show the prevalence of complex disfluencies and illustrate the types of distinctions that our scheme allows. Second, we present an annotation tool that allows the scheme to be easily applied. Third, we present the results of a reliability study in annotating complex disfluencies w...